To ensure MakeTerminal value overwrites any MCTS calculated value #677
Removed the change in search.cc, as it is actually not needed. |
Tested the AppVeyor build against v0.20.1 at a short time control (10+0.1s) on a 1060 (nps 3439, go nodes 130000).
Update: it seems worse with tablebases (TBs).
|
We are probably seeing this worse performance for the following reason. If a line that starts with a suboptimal move contains an unforced mate (e.g. after a blunder by the opponent, so really more of a helpmate) and the search has explored this mate, the search gets completely sucked into that suboptimal move. It is wrong for the huge Q value of an unforced mate (as enforced in this PR) to influence its parents so strongly. Or am I completely missing something here? |
Hi @ddobbelaere, good point; I guess that must be the reason. The latest commit lowers the maxnumber and takes the 'depth' into account. A first test showed that the original problem from issue #627 is still fixed, but I still have to test other positions to see whether it reduces the amount of trolling. PS: I know this is just a hack, because it does not consider any tree reuse (reaching the same hashed position via different moves), so eventually a better solution will have to be found.
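To make the concern above concrete, here is a minimal sketch of an averaging backup, assuming a node's Q is the plain running mean of the values backed up through it. `NodeStats` and `MeanAfterOneTerminal` are hypothetical names for illustration, not lc0 code, and sign flipping between plies is ignored.

```cpp
#include <cassert>

// Hypothetical, simplified node: only a visit count and a running mean.
struct NodeStats {
  int visits = 0;
  double q = 0.0;  // running average of all backed-up values

  // Incremental mean update, as used by averaging MCTS backups.
  void Update(double value) {
    ++visits;
    q += (value - q) / visits;
  }
};

// Backs up `ordinary_visits` evaluations equal to `ordinary_value`, then one
// visit returning `terminal_value`, and returns the resulting mean Q.
double MeanAfterOneTerminal(int ordinary_visits, double ordinary_value,
                            double terminal_value) {
  assert(ordinary_visits >= 0);
  NodeStats node;
  for (int i = 0; i < ordinary_visits; ++i) node.Update(ordinary_value);
  node.Update(terminal_value);
  return node.q;
}
```

With 99 visits at -0.2, a single bounded mate value of 1.0 leaves the mean at about -0.188, while a single oversized value of 100 lifts it to about 0.802, so one unforced mate can make an inferior move look winning.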
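One way to picture "lowering the maximum and taking depth into account" is sketched below. This is not the code from the commit: the constants `kMaxTerminal` and `kDepthPenalty`, the function `DepthAdjustedWin`, and the linear penalty are all assumptions chosen only to illustrate the idea that a nearer mate gets a larger terminal value.

```cpp
#include <cassert>

// Hypothetical illustration values, not taken from the PR.
constexpr double kMaxTerminal = 2.0;    // assumed cap for a mate at the root
constexpr double kDepthPenalty = 0.01;  // assumed reduction per ply of depth

// Returns a terminal win value that shrinks with search depth but never
// drops below 1.0, the value of an ordinary win.
double DepthAdjustedWin(int depth) {
  assert(depth >= 0);
  const double value = kMaxTerminal - kDepthPenalty * depth;
  return value < 1.0 ? 1.0 : value;
}
```

Under these assumptions a mate found at depth 10 scores higher than one found at depth 20, and very deep mates fall back to the ordinary win value.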
|
FYI, in lectures from DeepMind they said that they tried some heuristics to make A0 play the shortest checkmate, but their attempts decreased average strength. |
This line of code seems to stop trolling:
Tested on position: https://lichess.org/analysis/standard/1B1Q4/6P1/2P5/Q1K5/8/8/8/1k6_w_-_-#0
It also stops the blunder move from #627.
More testing is probably needed though. |
The reason we are seeing bogus evaluations is the conversion of the root Q value to centipawns employed in
which implicitly assumes that the magnitude of Q is not bigger than 1. Changing |
That being said, I concur with @mooskagh that the solution to the problem lies elsewhere (certainty propagation, for example). Nothing fundamental has changed for unforced mates, which still tend to pull the search unjustifiably toward an inferior move. |
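The point about the centipawn conversion can be illustrated with a tangent-shaped mapping. The constants below are an assumption, recalled from the lc0 source of roughly this period, and should be checked against search.cc; `QToCentipawns` and `IsInsideTanBranch` are hypothetical helper names. The sketch only shows why such a mapping breaks once the magnitude of Q exceeds 1.

```cpp
#include <cmath>

// Assumed constants (to be verified against search.cc): scale and slope of a
// tangent-shaped Q-to-centipawn mapping.
constexpr double kScale = 290.680623072;
constexpr double kSlope = 1.548090806;
constexpr double kHalfPi = 1.5707963267948966;

// Converts a Q value to centipawns; only meaningful while kSlope * |q| stays
// below pi/2, i.e. roughly for |q| <= 1.
double QToCentipawns(double q) { return kScale * std::tan(kSlope * q); }

// True if q stays on the central branch of tan, where the mapping is
// monotonic and keeps the sign of q.
bool IsInsideTanBranch(double q) { return std::fabs(kSlope * q) < kHalfPi; }
```

For Q = 1.0 the argument is still just below pi/2 and the result is a large positive score, but for Q = 1.5 the argument crosses the pole of the tangent and the reported score becomes negative even though Q is strongly positive.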
Hi @ddobbelaere, thanks for your comments.
|
|
Hi @ddobbelaere, thank you.
Interesting. It might be that the FEN positions I tested were already winning anyway. The problem of #627 is still there in mainline, and I'm curious when the terminal branch gets merged, as that seems to fix certain scenarios. |
This is a bit of a stretch, but it does seem to fix #627 and also #572.
This is probably because the MakeTerminal value is overwritten with a value big enough that accumulated MCTS visits can no longer overshoot the terminal result the search has already decided on.
The go nodes output after applying this as a 'hotfix', compared with issue #627, shows that the right move Bg1-f2 is found and kept after only a few nodes: